Practical Machine Learning for Software Engineering and Knowledge Engineering

Author

  • Tim Menzies
Abstract

Machine learning is practical for software engineering problems, even in data-starved domains. When data is scarce, knowledge can be farmed from seeds, i.e. minimal and partial descriptions of a domain. These seeds can be grown into large datasets via Monte Carlo simulations. The datasets can then be harvested using machine learning techniques. Examples of this knowledge farming approach, and the associated technique of data mining, are given from numerous software engineering domains.

Machine learning (ML) is not hard. Machine learners automatically generate summaries of data or existing systems in a smaller form. Software engineers can use machine learners to simplify systems development. This chapter explains how to use ML to assist in the construction of systems that support classification, prediction, diagnosis, planning, monitoring, requirements engineering, validation, and maintenance.

This chapter approaches machine learning with three specific biases. First, we will explore machine learning in data-starved domains. Machine learning is typically proposed for domains that contain large datasets. Our experience strongly suggests that many domains lack such large datasets. This lack of data is particularly acute for newer, smaller software companies. Such companies lack the resources to collect and maintain such data. Also, they have not been developing products long enough to collect an appropriately large dataset. When we cannot mine data, we show how to farm knowledge by growing datasets from domain models. Second, we will only report mature machine learning methods, i.e. those methods which do not require highly specialized skills to execute. This second bias rules out some of the more exciting work on the leading edge of machine learning research (e.g. Horn-clause learning). Third, in the author's view, it has yet to be shown empirically from realistic examples that a particular learning technique is necessarily better than the others (for evidence of this statement, see the comparisons of different learning methods in [34, 17, 36]). When faced with arguably equivalent techniques, Occam's razor suggests we use the simplest. We hence explore simple decision tree learners in this chapter. Decision tree learners execute very quickly and are widely used: many of the practical SE applications of machine learning use decision tree learners like C4.5 [33] or CART.
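The knowledge farming idea described in the abstract (seed, grow, harvest) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the chapter: the seed model `seed_model`, its attribute names, and its probabilities are hypothetical, and the learner is only a one-level decision tree chosen by plain information gain, a much simplified stand-in for learners such as C4.5 rather than C4.5 itself.

```python
import math
import random
from collections import Counter


def seed_model(rng):
    """Hypothetical seed: a minimal, partial description of a project domain.

    The attributes and probabilities below are invented for illustration.
    """
    features = {
        "experience": rng.choice(["low", "high"]),
        "volatility": rng.choice(["low", "high"]),
        "pressure": rng.choice(["low", "high"]),
    }
    # Assumed rule of thumb: volatile requirements dominate overrun risk,
    # low team experience adds a little, schedule pressure has no effect.
    risk = 0.1
    if features["volatility"] == "high":
        risk += 0.6
    if features["experience"] == "low":
        risk += 0.2
    overrun = "yes" if rng.random() < risk else "no"
    return features, overrun


def grow_dataset(n, seed=1):
    """Grow the seed into a dataset by Monte Carlo simulation."""
    rng = random.Random(seed)
    return [seed_model(rng) for _ in range(n)]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def best_split(rows):
    """Harvest: pick the attribute with the highest information gain."""
    base = entropy([label for _, label in rows])
    gains = {}
    for attr in rows[0][0]:
        remainder = 0.0
        for value in {features[attr] for features, _ in rows}:
            subset = [label for features, label in rows
                      if features[attr] == value]
            remainder += len(subset) / len(rows) * entropy(subset)
        gains[attr] = base - remainder
    return max(gains, key=gains.get), gains


if __name__ == "__main__":
    rows = grow_dataset(5000)
    best, gains = best_split(rows)
    print("root attribute:", best)
    for attr, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
        print(f"  {attr}: gain = {gain:.3f}")
```

With this hypothetical seed, the learner should select `volatility` as the root attribute, which recovers the dominant rule that was written into the model. A fuller decision tree learner would recurse on each branch and prune the result.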

Similar resources

Software Quality Modeling with Limited Apriori Defect Data

In machine learning the problem of limited data for supervised learning is a challenging problem with practical applications. We address a similar problem in the context of software quality modeling. Knowledge-based software engineering includes the use of quantitative software quality estimation models. Such models are trained using apriori software quality knowledge in the form of software me...

Constructing Engineering Knowledge: development of an online learning environment

With the development of new undergraduate degree programmes within Murdoch University's School of Engineering, the decision was made to offer courses, as much as practical, online. This provides numerous challenges to be addressed, including considerations of curriculum design and learning issues. Within the Software Engineering program, an infrastructure has been developed to address these issu...

Machine Learning and Value-Based Software Engineering

Software engineering research and practice thus far are primarily conducted in a value-neutral setting where each artifact in software development such as requirement, use case, test case, and defect, is treated as equally important during a software system development process. There are a number of shortcomings of such value-neutral software engineering. Value-based software engineering is to ...

An Environment for Project-Based Collaborative Learning of Software Design Patterns

Software engineering education faces increasing pressure to provide students with those skills required to solve different kinds of software problems both, alone or as a member of a development team. Consequently, one of the main goals of software engineering curriculum is to teach students how to model, design and implement software, as well as how to exploit previous successful experiences an...

Applying Machine Learning Algorithms in Software Development

Machine learning deals with the issue of how to build programs that improve their performance at some task through experience. Machine learning algorithms have proven to be of great practical value in a variety of application domains. They are particularly useful for (a) poorly understood problem domains where little knowledge exists for the humans to develop effective algorithms; (b) domains w...

Publication date: 2000